SNOBOL

SNOBOL is a programming language for manipulating strings
of characters. SNOBOL's simple statement formats,
simplified input/output, and automatic storage allocation
make it easy for the novice programmer to learn. On
the other hand, the power of SNOBOL's commands for
character string manipulation allows elegant and
sophisticated programs to be written in SNOBOL. SNOBOL
was developed by Farber, Griswold, and Polonsky of
Bell Telephone Laboratories in 1962, and was implemented on the
IBM 7090. SNOBOL languages are currently implemented
on a number of machines. Write-ups of SNOBOL
are contained in "SNOBOL, A String Manipulation
Language", Journal of the Association for Computing
Machinery, Vol. 11, No. 1 (January 1964), pp. 21-30,
and "The SNOBOL3 Programming Language", The Bell System
Technical Journal, Vol. XLV, No. 6, July-August 1966.
See "SNOBOL3 Primer", by Allen Forte, MIT Press, 1967,
for a simple, clear primer in paperback.


While FORTRAN deals mainly with numbers, SNOBOL deals
with strings of characters. In FORTRAN a variable usually
contains a fixed or floating point number. In SNOBOL,
a variable contains a pointer to a string of characters.


The Applied Logic implementation of SNOBOL has a simple
means of doing input/output using the Teletype and the
Disc (or Drum) for reading and writing character strings.

Before going into elaborate detail on the syntax of our
implementation of SNOBOL, let us examine three simple
examples of its use.

EXAMPLE ONE 

In Example 1 we consider the task of printing a vertical
list of the words contained in a sentence inputted on the
Teletype. Let us assume that a blank, comma-blank,
and period are the legal punctuation separating words.
Below is a listing of the SNOBOL source for a program
implementing this task. The compiling, assembling,
and loading needed to run the program are shown. Following
the example is a detailed explanation of the action
of each line of SNOBOL coding.
EXAMPLE -1-

00010           ;EXAMPLE #1
00020           ;INPUT SENTENCE FROM TELETYPE AND MAKE VERTICAL LIST OF WORDS.
00030           ;CONSIDER BLANK, COMMA-BLANK, AND PERIOD AS LEGAL PUNCTUATION.
00040
00050   START:  .NTYPE "*"      ;TYPE * WITHOUT CARRIAGE RETURN-LINE FEED.
00060           .ACCEPT LINE    ;READ SENTENCE FROM TELETYPE
00070   PARSE:  LINE *WORD* " "!", "!"." =      /F(LAST)
00080           .TYPE WORD      ;TYPE WORD AND CARRIAGE RETURN LINE FEED
00090           /(PARSE)
00100   LAST:   .TYPE LINE      ;TYPE REMAINDER OF LINE
00110                           ;PRESUMABLY EMPTY.
00120           /(START)
00130           .END

.R SNOBOL

*EX1.TEM←EXAMP1

NO UNDEFINED SNOBOL ADDRESSES

.R MACRO

*EXAMP1.REL←EX1.TEM

THERE ARE NO ERRORS

PROGRAM BREAK IS 000131

5K CORE USED

.R LOADER
*SNOOPS,EXAMP1.REL,/G

LOADER
CORE 1

EXIT
Explanation of Example 1

A semicolon and the characters to the right of a semicolon are
considered to be a comment by the SNOBOL compiler and are ignored.
Exception: a semicolon between matching quotes is considered as a
character in a string. Lines 10, 20, and 30 represent SNOBOL
lines which are comments. Lines 50, 60, 80, 100, and 110 show
the use of comments at the termination of SNOBOL lines.

Blank lines are ignored by the SNOBOL compiler and can be used
by the programmer to improve the readability of his coding. For
example, see Line 40.

START, PARSE, and LAST are words used as address labels in
Example 1 in specifying the logical flow of the program. LINE and
WORD are words used to name string variables. A label or a
string variable is any word made up of letters, numerals, period,
percent sign, or dollar sign which does not start with a numeral or
a period. A label can be any number of characters; however, only
the first six characters of a label or a variable are used by the
compiler. Words which start with a period are reserved for various
SNOBOL command functions. A label followed by a colon at the
beginning of a SNOBOL line defines the label to be equal to that
physical location in the program. For example, see Lines 50, 70, and
100. Such labels serve the same purpose as statement numbers in
a FORTRAN program. String variables such as LINE and WORD
are defined merely by their appearance. For example, LINE is
defined by its appearance in Line 60; WORD is defined by its
appearance in Line 70. As in FORTRAN, the appearance of a variable is
sufficient to define it. Care must be taken not to use the same word
for a string variable and a location label.


Blanks and tabs are used as delimiters in a SNOBOL line as needed 
or desired. For example, in Line 50 no delimiters are needed, 
while in Line 60 .ACCEPTLINE would be ambiguous without the 
blank between .ACCEPT and LINE. Note the use of tabs in 
Lines 50 through 130. 

The execution of a SNOBOL program starts with the first executable
SNOBOL instruction; in this example, Line 50. The label and the
comment have already been explained. The executable portion of
the instruction is the SNOBOL command .NTYPE "*" which
requests that the literal string consisting of an asterisk be typed. The
second executable SNOBOL instruction requests that a line be
inputted from the Teletype, that the string of characters corresponding to
the line typed be stored, and that a pointer to this character string be
stored in LINE. Though, in fact, LINE contains a pointer to a
string of characters, for almost all purposes the programmer can
think of LINE as containing a string of characters. In the explanations
of the examples we will say that a variable contains a string, when
in fact it will contain a pointer to a string. The carriage-return
line-feed which normally terminates a typed line is deleted from
the end of the string of characters before it is stored. See the
sample runs for the asterisk outputted and the sentence inputted by
the user.

In Line 70 we have a SNOBOL instruction which does not involve
input/output and which is more typical of SNOBOL instructions. The
label PARSE is defined as described above. The next element
of the line is a string variable and indicates that some action is to
be taken with the string LINE. The second element, *WORD*,
is used as a filler in a string matching process which is at the
heart of SNOBOL. The third element, " "!", "!".", represents the
disjunction of three literal strings consisting of a blank, comma-blank,
and period. The exclamation point can be read as an "or".
A fourth element is the equal sign. This instruction has the
following action:

Step A
     WORD is set equal to the null (empty) string.

Step B
     The first (next) character(s) of LINE is (are) examined to see
     if it is a blank, comma-blank, or a period. If it is, go to Step E.

Step C
     The character just examined from LINE is appended to the
     string in WORD.

Step D
     If there is a next character in LINE go to Step B. If there is
     not, a special dedicated accumulator, called the "test
     accumulator", is set to "fail" and the action of the instruction
     terminates.

Step E
     The initial portion of LINE which matches WORD and the
     blank, or the comma-blank, or the period, is replaced by the
     string named to the right of the equal sign. In this example there
     is no string to the right of the equal sign, so these initial
     characters of LINE are simply deleted. The test accumulator is set
     to "success".
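Steps A through E can be sketched in a modern language. The Python below is only an illustration of the steps as described; the function names match_word and split_sentence are hypothetical and are not part of SNOBOL or of SNOOPS, and the returned True/False stands in for the test accumulator.

```python
def match_word(line, delimiters=(" ", ", ", ".")):
    """Mimic  LINE *WORD* " "!", "!"." =  by following Steps A-E.

    Returns (success, word, new_line). On failure LINE is left intact.
    """
    word = ""                                   # Step A: WORD starts empty
    pos = 0
    while pos < len(line):                      # Step D: is there a next character?
        for d in delimiters:                    # Step B: try each alternative in order
            if line.startswith(d, pos):
                # Step E: delete WORD plus the delimiter from the front of LINE
                return True, word, line[pos + len(d):]
        word += line[pos]                       # Step C: extend WORD by one character
        pos += 1
    return False, word, line                    # Step D: no match, "fail"


def split_sentence(line):
    """Mimic the PARSE loop of Example 1: collect words until the match fails."""
    words = []
    while True:
        ok, word, line = match_word(line)
        if not ok:
            return words
        words.append(word)
```

For instance, split_sentence("TODAY IS THE DAY.") collects the four words TODAY, IS, THE, and DAY, and stops when the remaining line is empty and the match fails.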


The /F(LAST) which terminates Line 70 indicates that the
program is to transfer to the instruction labelled LAST if the test
accumulator was set to "fail". In the other case, that is, the case
where the string match was successful and hence the test
accumulator was set to "success", the next instruction (Line 80) is to be
executed. Transfer instructions such as /F(LAST) can appear
to the right of SNOBOL commands or can stand alone on a line as
SNOBOL instructions. For example, Line 90 is an unconditional
transfer to the instruction labelled PARSE.


Line 80 is a command to type the string in WORD on the
Teletype. A carriage-return line-feed is appended at the end of each
string typed by the .TYPE command.


Line 90 is an unconditional transfer back to the instruction labelled
PARSE. Looking at the first sample run, the first time Line 90
is executed, WORD is the five-character string TODAY and
LINE is the string IS THE DAY FOR ALL GOOD MEN TO COME
TO THE AID OF THEIR COUNTRY. In the first sentence tested
the loop from instructions 70-90 is executed 16 times, and on the 17th
execution of Line 70, LINE is the null string; hence the string
match requested fails and the program transfers to the instruction
labelled LAST.

Line 100, which is the instruction labelled LAST, is a request to
type the contents of LINE. If a "legal" sentence has been typed
at Line 60, LINE will, in fact, be the null line, and this accounts
for the blank line appearing in our sample run between COUNTRY
and the * which indicates to the user that a second sentence is to
be typed.

Line 120 is an unconditional transfer back to the start of the
program. Line 130 contains the SNOBOL command .END. This
command signals the end of the source to the SNOBOL compiler. If
this statement is omitted, the compiler gives a warning that no
.END was found and an .END is assumed.


A detailed description of the SNOBOL language will appear below.
Let us comment briefly on the compiling, assembling, loading, and
execution of Example 1. The Teletype output from this sequence of
operations is shown following the source language for Example 1 above.
The .R SNOBOL calls in the SNOBOL compiler from SYS and
begins its execution. It requests a command string from the user by
typing an asterisk. No devices should be given in the command string
since the DSC (or DRM) is always assumed. The compiler creates
a MACRO assembly language program, which in our example we have
called EX1.TEM. The compiler has told us that there are no
undefined SNOBOL addresses, and returns to the monitor. The MACRO
assembler is then called in from SYS and the MACRO source file
EX1.TEM is assembled. In the example we have called the assembled
file EXAMP1.REL. The loader is then called from SYS, and our
relocatable file, EXAMP1.REL, and the SNOBOL operating system,
SNOOPS.REL from SYS, are loaded. The SNOBOL operating
system is a collection of subroutines which is called by the MACRO coding
generated by the SNOBOL compiler from our SNOBOL source coding.
Observe that this sample program and the entire SNOBOL operating
system load in 1K.


EXAMPLE TWO 

As a second example let us consider the SNOBOL program below,
which accepts a FORTRAN IV source file name from the Teletype, gets
that file in from the Disc (or Drum), and replaces the leading spaces
of lines in card format by a tab for Teletype format. The algorithm
which we use is as follows:

If a line from the source file does not contain five characters it is
assumed either to be a blank line or to be already in Teletype format. If
a line contains at least five characters and has a tab among the first
five characters, then the line is also considered to be in Teletype format;
otherwise the line is considered to be in card format. The first five
characters of a line are removed and stored in FRONT. All blanks
in FRONT are deleted and a tab is appended to the right-hand end of
FRONT. These two maneuvers work whether or not there is a statement
number. If the sixth character of the line, the continuation character,
is blank, it is deleted. Otherwise the sixth character is replaced by
the numeral 1. The line is then outputted to the Disc (or Drum) and
the next line processed. This program creates a temporary
output file for the lines as they are processed. This file is called
QQTAB.TEM. After the file is successfully translated the original file
is deleted and QQTAB.TEM is renamed to have the name of the
original FORTRAN IV source file.
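The per-line rule just described can be sketched as follows. This Python is an illustration of the algorithm as stated in the prose above, not a transcription of the SNOBOL program; the function name card_to_teletype is hypothetical.

```python
TAB = "\t"


def card_to_teletype(line):
    """Convert one FORTRAN source line from card format to Teletype format."""
    if len(line) < 5:
        return line                  # blank line, or already in Teletype format
    front, rest = line[:5], line[5:]
    if TAB in front:
        return line                  # a tab in columns 1-5: already Teletype format
    # Card format: squeeze blanks out of the statement-number field, add a tab.
    front = front.replace(" ", "") + TAB
    if rest.startswith(" "):
        rest = rest[1:]              # blank continuation column is deleted
    elif rest:
        rest = "1" + rest[1:]        # any other continuation character becomes 1
    return front + rest
```

Under these rules a line such as six blanks followed by SUM=0. becomes a tab followed by SUM=0., and a statement number keeps its digits but loses its surrounding blanks.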


EXAMPLE -2-

.R PIP2

*TTY:←DRM:EXAMP2

00010           ;EXAMPLE 2
00020
00030           ;REMOVE BLANKS, INSERT TABS
00040           ;DELETE BLANKS IN COLUMN 6 OR REPLACE
00050           ;NON-BLANKS WITH 1.
00060
00070   START:  .NTYPE "*"      ;TYPE * WITHOUT CARRIAGE RETURN-LINE FEED.
00080           .ACCEPT FILNAM  ;READ FILE NAME
00090           .OPIN FILNAM    /F(START)
00100           .OPOUT "QQTAB.TEM"
00110   LINLUP: .READ LINE      /F(EXIT)
00120           LINE *FRONT/5* =        /F(OUTLN2)
00130           FRONT "	"       /S(TEST2)
00140
00150   BLNK:   FRONT " " =     /S(BLNK)
00160           FRONT = FRONT "	"       ;APPEND A TAB
00170   TEST2:  LINE← " " =     /S(OUTLIN)
00180           LINE *CHAR/1* = "1"
00190   OUTLIN: .WRITE FRONT LINE       /(LINLUP)
00200   OUTLN2: .WRITE LINE     /(LINLUP)
00210   EXIT:   .RIN ""
00220           .ROUT FILNAM    /(START)
00230           .END
*TTY:←DRM:TEST
      SUM=0.
      DO 10 I=1,100
C FIND THE SUM OF THE NUMBERS FROM 1 TO 100.

      SUM = SUM + FLOAT(I)
10    CONTINUE
      TYPE 9,SUM
9     FORMAT(3X,'SUM OF THE NUMBERS FROM 1 TO 100 IS'
     1 F7.2//)
      CALL EXIT
      END

*↑C

EXAMPLE -2- (cont'd.)

.R SNOBOL

*[command string illegible]

NO UNDEFINED SNOBOL ADDRESSES

EXIT
↑C

.R MACRO

*[command string illegible; output EXAMP2.REL, input EXAMP2.MAC]

THERE ARE NO ERRORS
PROGRAM BREAK IS [illegible]
5K CORE USED

*↑C

.R LOADER
*[switches illegible] SNOOPS,EXAMP2.REL

LOADER
CORE 4
EXIT
↑C
.SAVE DRM EXAMP2
JOB SAVED
↑C
.START
*TEST
*↑C
.R PIP2
*TTY:←DRM:TEST
	SUM=0.
	DO 10 I=1,100
C FIND THE SUM OF THE NUMBERS FROM 1 TO 100.
	SUM=SUM+FLOAT(I)
10	CONTINUE
	TYPE 9,SUM
9	FORMAT(3X,'SUM OF THE NUMBERS FROM 1 TO 100 IS'
	1 F7.2//)
	CALL EXIT
	END

*↑C

Explanation of Example 2

Lines 10, 30, 40, and 50 are comment lines. Lines 20, 60, and
140 are blank lines. Line 70 types an asterisk without a terminating
CR-LF on the Teletype. Line 80 accepts a line of input from the
Teletype, strips off the trailing CR-LF, and stores the resulting string in
FILNAM. Line 90 contains the SNOBOL command to open the input file
whose name is stored in FILNAM. If the file named is not on the
Disc (or Drum) the SNOBOL operating system (SNOOPS) complains and
sets the test accumulator to "fail". The /F(START) terminating Line
90 causes a transfer back to Line 70 if the file is not found; otherwise
control passes to Line 100.

In Line 100, a Disc (or Drum) file is opened for output with the
name QQTAB.TEM.

Line 110 through Line 200 constitute a loop which reads a line of
source, processes it, and writes it back out on QQTAB.TEM. In
Line 110, the SNOBOL command .READ LINE reads the next line of
the input from the Disc (or Drum) input file. The trailing CR-LF is
deleted, and the resulting string is stored in LINE. If no next line
exists, the test accumulator is set to "fail"; otherwise the test
accumulator is set to "success". In Line 110, the transfer to Line 210 is
effected when one attempts to read the last-plus-one line.

In Line 120, the first element is the SNOBOL variable LINE. This
indicates to the compiler that this line is a form of string matching. In
this example there is only one element between the first element LINE
and the equal sign. This element, *FRONT/5*, indicates that five
characters from LINE should be copied into FRONT. The equal sign
has no element to its right, so the first five characters of LINE will
be deleted. The effect of Line 120 is that if LINE has five or more
characters, the first five characters are removed from LINE, this
five-character string is stored in FRONT, and the test accumulator
is set to "success"; if LINE has four or fewer characters, then
LINE remains the same and the test accumulator is set to "fail." In
the case that LINE does not contain at least five characters, transfer
passes to Line 200, where the line is written out on QQTAB.TEM, and
control is then passed to the top of the loop at Line 110. If LINE had
five characters, control passes to Line 130.

The effect of Line 130 is that the five-character string in FRONT
is searched for a tab. (The string between the two quotes may look
like blanks; it is in fact a tab.) Notice in Line 130 that there is no
equal sign. Line 130 is an example of string matching whose sole
purpose is program control. In this case, if FRONT contains a tab,
control is transferred to Line 190. Let us assume that FRONT did
not contain a tab. In this case the original line must have been in card
format. Hence, control passes to Line 150.


Line 150 is an example of the convenience of SNOBOL. The effect
of Line 150 is to remove all blanks from the string FRONT (the string
between quotes is in fact a single blank). Line 150 works as follows.
If a blank is found in FRONT it is replaced by nothing, since nothing
appears to the right of the equal sign. That is to say, the first blank in
FRONT, if there is one, is deleted. If a blank was in fact deleted the
test accumulator is set to "success" and transfer is passed back to the
beginning of the instruction. The transfer is effected by the /S(BLNK)
at the end of Line 150. FRONT is then scanned again for a blank. If
one is found, it is deleted and transfer is passed again to the front of
Line 150. This is continued until all the blanks are removed from
FRONT. After a search for a blank which fails, the test accumulator
is set to "fail" and control is transferred to Line 160.
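The loop-on-success idiom of Line 150 can be sketched in Python. The helper delete_first below is a hypothetical stand-in for a single execution of the statement: it deletes the first occurrence of a literal and reports success or failure in the way the test accumulator would.

```python
def delete_first(subject, literal):
    """One execution of  SUBJECT "literal" =  : delete the first occurrence."""
    i = subject.find(literal)
    if i < 0:
        return False, subject                    # "fail": subject left intact
    return True, subject[:i] + subject[i + len(literal):]   # "success"


def remove_blanks(front):
    """Mimic  BLNK: FRONT " " = /S(BLNK)  by repeating while the match succeeds."""
    executions = 0
    ok = True
    while ok:
        ok, front = delete_first(front, " ")
        executions += 1
    return front, executions
```

A field containing k blanks causes the statement to be executed k+1 times: k successful deletions followed by the one failing search that lets control fall through.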

The effect of Line 160 is to append a tab to the right-hand end of
FRONT. (The string within the quotes is in fact a single tab.) Line
160 works as follows. The first element in Line 160 is a string variable.
This signals the compiler that the line is an application of string
matching. Since there is no second element before the equal sign, this
indicates that the contents of the first string will be replaced by the
contents of the string(s) to the right of the equal sign. In Line 170 we have
another application of string matching. Notice, however, the back arrow
immediately following the first element LINE. This indicates that the
string being matched must include the first character of LINE. The
effect of this line is to delete the leading character of LINE if it is a
blank and transfer control to Line 190, or to do nothing to LINE and
fall through to Line 180 if the first character of LINE was not a blank.
Since the character being examined was originally the sixth character of
the source line, this instruction tests the continuation field. In Line 180
we replace the first character in LINE, if there is one, by the numeral 1.
If at this point LINE were empty, Line 180 would not alter LINE.
Control then passes to Line 190.

Line 190 causes the string in FRONT concatenated with the string 
in LINE to be outputted along with a CR-LF to QQTAB.TEM. Con- 
trol is then passed to the top of the loop at Line 110. After all lines 
have been inputted, the read command at Line 110 will fail and control 
will pass to Line 210. The SNOBOL command in Line 210 calls for the 
input file to be renamed to a file with a null name. That is, the null 
string which is generated by two adjacent quotes, represents a null name. 
The effect of renaming an input file to a null name, is to delete that 
file. Control is then passed to Line 220. 

In Line 220, the SNOBOL command .ROUT FILNAM renames the
output file to be the name contained in FILNAM. Control is then passed
to Line 70 so that the user can insert a new file name for processing.


EXAMPLE THREE 


As a final example, let us consider the SNOBOL program below, which
inputs a list of words, written one word to the line, and outputs the list
as a continuous stream of characters separated by commas, and then
outputs an alphabetized list in the same format. The first word of the
input list is a number which gives the number of letters in the longest word
in the list. This number will be deleted before the list is printed.


EXAMPLE -3-

00010           ;ALPHABETIZATION USING A RADIX SORT TECHNIQUE
00020
00030   BEGIN:  .NTYPE "*"
00040           .ACCEPT FILNAM
00050           .OPIN FILNAM    /F(BEGIN)
00060           ;READ MAXIMUM SIZE OF WORD
00070   START:  .READ SIZE      /F(BEGIN)
00080           SIZE > "0"      /F(BEGIN)
00090
00100           ;READ ALL WORDS - SEPARATE WITH COMMAS.
00110           LIST =
00120   READER: .READ WORD      /F(TYPE1)
00130           LIST = LIST WORD ","    /(READER)
00140
00150           ;TYPE LIST TO BE ALPHABETIZED
00160   TYPE1:  .TYPE "LIST TO BE ALPHABETIZED: " LIST
00170
00180   DECSIZ: SIZE = SIZE - "1"
00190           SIZE < "0"
00200           /S(FINAL)
00210
00220   GETWRD: LIST *WORD* "," =       /F(REMAKE)
00230           WORD *HEAD/SIZE* *PIT/1*
00240           /F(STOBIN)
00250           @PIT = @PIT WORD ","    /(GETWRD)
00260   STOBIN: BIN = BIN WORD ","      /(GETWRD)
00270
00280   REMAKE: BIN *LIST* =
00290           ALPHA = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
00300   NXTLET: ALPHA *PIT/1* = /F(DECSIZ)
00310           LIST = LIST @PIT
00320           @PIT =  /(NXTLET)
00330
00340   FINAL:  .TYPE "ALPHABETIZED LIST: " LIST
00350           /(BEGIN)
00360           .END

EXAMPLE -3- (cont'd.)

↑C

.R SNOBOL

*[command string illegible]
NO UNDEFINED SNOBOL ADDRESSES

EXIT
↑C

.R MACRO

*[command string illegible; output EXAMP3.REL, input EXAMP3.MAC]

THERE ARE NO ERRORS
PROGRAM BREAK IS [illegible]
5K CORE USED

*↑C

.R LOADER
*[switches illegible] SNOOPS,EXAMP3.REL

LOADER
CORE 4

EXIT
↑C

.SAVE DRM EXAMP3

JOB SAVED
↑C

[command illegible: the file TEST is created from the Teletype]

[first line illegible: maximum word length]
TESTED
TRIED
ALWAYS
GIFT
XMAS
ACCEPT
BECAUSE
HOSPITAL
↑Z

[command illegible] DRM EXAMP3 4
JOB SETUP
↑C

.START
*TEST

LIST TO BE ALPHABETIZED: TESTED,TRIED,ALWAYS,GIFT,XMAS,ACCEPT,BECAUSE,HOSPITAL,

ALPHABETIZED LIST: ACCEPT,ALWAYS,BECAUSE,GIFT,HOSPITAL,TESTED,TRIED,XMAS,

↑C

Explanation of Example 3

In Lines 30 through 50 the program types an asterisk, accepts a
file name typed on the Teletype, and opens that file. If the file is not
found on the Disc (Drum) the program is restarted. In Line 70 we read
the first line of the drum input file. If the file is empty, the test
accumulator would be set to "fail" and the transfer at the right-hand end of Line
70 would be effected. The string in SIZE is considered by this
program to be a SNOBOL integer. Any SNOBOL string can be considered to
be an integer by reading the initial numerals as a decimal integer. The
end of the string or the first non-numerical character ends the number.
The only exception to this rule is that a leading minus sign makes the
number which follows negative. For example, "-", "0", "", "000",
"0.3", "-0", "+3", "--5" are all considered to be zero as a SNOBOL
integer. Also, "-5", "-5.5", "-5A" are all considered to be -5. In Line
80 we test to see if SIZE is indeed a positive integer. Notice the use
of the literal zero. This statement could also have been written

     "0" < SIZE /F(BEGIN)
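The rule for reading a string as a SNOBOL integer can be sketched in Python. This is an illustration of the rule as stated above, not the SNOOPS routine itself; the function name snobol_int is hypothetical.

```python
DIGITS = "0123456789"


def snobol_int(s):
    """Read the initial numerals of s as a decimal integer.

    A single leading minus sign negates the number; the end of the string
    or the first non-numeral ends it. No numerals at all gives zero.
    """
    sign = 1
    start = 0
    if s.startswith("-"):
        sign = -1
        start = 1
    end = start
    while end < len(s) and s[end] in DIGITS:
        end += 1
    digits = s[start:end]
    return sign * int(digits) if digits else 0
```

With this reading, a leading plus sign or a second minus sign counts as a non-numerical character, so the number ends before any numeral is seen and the value is zero.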


The operation of Line 80 is performed as follows. A routine in the
SNOBOL operating system (SNOOPS) evaluates SIZE as a SNOBOL
integer. Similarly it evaluates a string consisting of a zero alone as a
SNOBOL integer, and sets the test accumulator to "success" if the value
of SIZE is currently greater than zero.

The transfer at the right-hand side of Line 80 is effected if the first
characters in SIZE are not considered to be a positive integer. In
Lines 110 through 130 we read all the words and make them into a single
string separated by commas. In Line 160 we type the list to be
alphabetized on the Teletype. In Line 180 we reduce the size of SIZE by
1, and in Line 190 we test to see if SIZE is negative. In Line 200 we
transfer to the final output coding if SIZE is negative.

The loop from Line 220 through Line 260 is used to sort all the
words on LIST into 27 "bins." There is one bin for each letter of
the alphabet, and the 27th bin is for words whose length is less than or
equal to the number currently stored in SIZE. The initial use of this loop
sorts all words whose length is less than the maximum length into the
bin called BIN, and sorts the longest words into each of the bins
A, B, C,...,Z according as their last letter is A, B, C,...,Z
respectively. On the second application of this loop, the words of maximum
length are sorted on the next-to-the-last letter, the words of one less
than maximum length are sorted on their final letter, and words
shorter than that are sorted into BIN.

In Line 220 the first word in LIST is copied into WORD and
the initial word and comma are deleted from LIST. However, if
LIST was empty, transfer is made to Line 280. In Line 230, we see
two different types of application of fixed length fillers. The first
filler,

     *HEAD/SIZE*

is interpreted as follows. SIZE is interpreted as a SNOBOL integer
and that number of initial letters of WORD is copied into HEAD.
The second filler is a fixed length filler where the size is given by an
integer, so that the next character after the last character read into
HEAD is stored in PIT. If this is successful the test accumulator is
set to "success." However, if WORD is too short, i.e., does not
contain SIZE+1 letters, the test accumulator is set to "fail." Line
240 effects a transfer to Line 260 in the case that WORD was too
short. In Line 250 the list whose name is in PIT is lengthened by
adding the contents of WORD and a comma. The commercial at-sign
(@) in front of PIT indicates that it is not the contents of PIT
but the contents of the string named in PIT that is being referred to;
this feature is known as indirect naming. If the string in PIT
happened itself to be a variable name preceded by a commercial
at-sign, the indirect naming would go down one level deeper. Indirect
naming can be used to an arbitrary level of indirectness.
Indirectness can also be used with addresses. That is, a transfer can be
made to a label named in a SNOBOL string. Line 260 is entered in
the case that WORD is shorter in length than the number in SIZE.
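Indirect naming can be pictured as a lookup through a table of variables. The Python below is a sketch under that assumption; the dictionary representation and the names resolve and append_indirect are hypothetical, chosen only to illustrate how @PIT refers to the string named in PIT.

```python
def resolve(name, variables):
    """Follow leading at-signs until a plain variable name remains.

    '@PIT' denotes the variable whose name is the string stored in PIT.
    If that string itself begins with '@', one more level is followed.
    """
    while name.startswith("@"):
        name = variables.get(name[1:], "")
    return name


def append_indirect(variables, pointer, word):
    """Mimic  @PIT = @PIT WORD ","  for the pointer variable given."""
    target = resolve("@" + pointer, variables)
    variables[target] = variables.get(target, "") + word + ","
    return target
```

For example, with PIT holding "T", appending TESTED lengthens the string named T, not the string PIT.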


In Lines 280 through 320, LIST is regenerated by putting the shorter
words first, followed by all of the words in List A, List B,..., List Z.

Line 280 represents an interesting use of a filler. The action here
is to put the contents of BIN into LIST and to empty BIN.
Notice that fillers are used to match the smallest possible string
except in the case that a filler appears at the beginning or end of the
substring being matched. A filler at the beginning or end of the
substring, if it were to be as short as possible, would always be empty.
Therefore, it is convenient to have an initial filler include all of the
beginning of the main list and a filler at the end of the substring
include all of the characters at the end of the main list. Hence, a
substring consisting only of a filler matches the entire main string. In
the case of Line 280, LIST matches all of BIN and the equal sign
indicates that the contents of BIN is to be deleted. Lines 300 through
320 are the loop which adds the 26 lists to LIST. After 26 passes
through the loop, control is transferred to Line 180, where SIZE is
decremented, etc. When SIZE has been decremented to -1, control
transfers to Line 340, where the alphabetized list is printed out. After
this typeout, control is passed to the beginning of the program for the
user to type in a new file name.
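The sorting method of Example 3 as a whole can be sketched in Python. This is an illustration of the radix technique described above, assuming every word consists only of the capital letters A through Z; the function name radix_alphabetize is hypothetical, and ordinary lists stand in for the comma-separated SNOBOL strings.

```python
import string


def radix_alphabetize(words, size):
    """Alphabetize words; size is the number of letters in the longest word."""
    lst = list(words)
    size -= 1                                    # Line 180: SIZE = SIZE - "1"
    while size >= 0:                             # Lines 190-200: stop when negative
        short = []                               # the 27th bin, BIN
        bins = {c: [] for c in string.ascii_uppercase}
        for w in lst:                            # Lines 220-260
            if len(w) > size:
                bins[w[size]].append(w)          # sort on letter number SIZE+1
            else:
                short.append(w)                  # too short: goes into BIN
        lst = short                              # Line 280: LIST takes over BIN
        for c in string.ascii_uppercase:         # Lines 290-320: add lists A to Z
            lst.extend(bins[c])
        size -= 1
    return lst
```

Because each pass keeps the earlier order within a bin and puts the short words first, the final pass on the first letter leaves the list in alphabetical order.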


We now describe the syntax of the Applied Logic
implementation of SNOBOL. In most respects we have followed the syntax of
the original implementations of SNOBOL. Where we have deviated, it is
because we have the ASCII character set available and because the
ASSEMBLE/COMPILE feature (i.e., the in-line use of the MACRO assembly
language code) makes it desirable to have the syntax for comments and
labels be compatible with MACRO assembly language.

Literal Strings

     ""  is a literal string representing the null (or empty) string.

     [literal illegible]  is a literal string representing a string
     consisting of a quote alone.

     "L1L2...Ln"  is a literal string, where L1L2...Ln are any ASCII
     characters other than quotes (n≥1).

     NOTE: Strings are stored internally as a continuous string
     of ASCII characters terminated by a null (a null is a 7-bit
     field of all binary zeros).


Names

     Any word using only letters of the alphabet, numerals, percent
sign, or period, that does not start with a numeral or a period, is a
name. Names are used as "string names" to name string variables and
as address labels. The compiler allows names of any non-zero length.
However, only the first six characters are used by the compiler. For
example, ABCDEF1 and ABCDEF2 are both legal names which are considered
to be the same name by the compiler.
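The rule for names can be sketched in Python as a check followed by truncation. This illustrates the rule as stated in this section only; the function name significant_name is hypothetical, and accepting lower-case letters is an assumption.

```python
import string

_ALLOWED = set(string.ascii_letters + string.digits + "%.")


def significant_name(name):
    """Return the part of a legal name that the compiler actually uses."""
    if not name:
        raise ValueError("a name must have non-zero length")
    if name[0] in string.digits + ".":
        raise ValueError("a name may not start with a numeral or a period")
    if not set(name) <= _ALLOWED:
        raise ValueError("illegal character in name")
    return name[:6]                  # only the first six characters count
```

Two names collide exactly when this function returns the same string for both, as with ABCDEF1 and ABCDEF2.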


Labels

     A name which is initially placed in a line and immediately
followed by a colon is a label. Care should be taken not to allow
label and string names to coincide.


String Element

     1. A string name is a string element.

     2. A literal string is a string element.

Substring Element

     1. A string element is a substring element.

     2. If S and T are string elements but not string literals and if n is a
        positive decimal numeral, then

             a) *S*
             b) *S/n*
             c) *S/T* and
             d) &S&

        are substring elements.

     3. If S1, S2,...,Sn (n≥2) are string elements, S1!S2!...!Sn is a
        substring element. (For example, "VAR"!"12"!VAR3!"345" is a
        substring element.) The ! is read as OR.

     NOTE: Substring elements are concatenated together to make a
     pattern which is to match a designated character string.


Comments

     A semicolon, not captured within part of a literal, together with
all characters to the right of such a semicolon, are considered to be
a comment.


1.   Format:   E S1 S2 .... Sn     (n>0)

     Action:   Sets test accumulator to "success" if S1 S2 .... Sn
               matches a consecutive substring of E; otherwise
               sets test accumulator to "fail."

2.   Format:   E S1 S2 .... Sn = R1 R2 .... Rm     (m≥0, n≥0)

     Action:   (n=0): Replace E by R1 R2 .... Rm. No effect on
               test accumulator.

     Action:   (m=0): Delete first substring of E matched by
               S1 S2 .... Sn and set test accumulator
               to "success;" if no match, leave E intact
               and set test accumulator to "fail."

     Action:   (n>0, m>0): Replace first substring of E matched
               by S1 S2 .... Sn with the string
               R1 R2 .... Rm and set the test
               accumulator to "success;" if no
               match, leave E intact and set test
               accumulator to "fail."

3.   Format:   E S1 S2 .... Sn = A     (n≥0)

     Action:   Same as 2. above with the character string
               which represents the value of A used in lieu
               of R1 R2 .... Rm.

4.   Format:   E S1 S2 .... Si } Si+1 .... Sn = R1 R2 .... Rm
               (0<i<n, m≥0)

     Action:   Same as 2. above except that only the portion
               of the substring matched by S1 S2 .... Si is
               replaced or deleted.

5.   Format:   E S1 S2 .... Si } Si+1 .... Sn = A     (0<i<n)

     Action:   Analogous to 4. above.

6.   Format:   E S1 S2 .... Si { Si+1 .... Sj } Sj+1 .... Sn =
               R1 R2 .... Rm     (0<i<j<n, m≥0)

     Action:   Analogous to 4. above except that the substring
               of E corresponding to Si+1 .... Sj is replaced
               or deleted.

7.   Format:   E S1 S2 .... Si { Si+1 .... Sj } Sj+1 .... Sn = A
               (0<i<j<n)

     Action:   Analogous to 6. above.

8.   Format:   E S1 S2 .... Sj { Sj+1 .... Sn = R1 R2 .... Rm
               (0<j<n, m≥0)

     Action:   Analogous to 4. above except that the substring
               of E corresponding to Sj+1 .... Sn is replaced
               or deleted.

9.   Format:   E S1 S2 .... Sj { Sj+1 .... Sn = A
               (0<j<n)

     Action:   Analogous to 8. above.

In order to understand the operation of the string command, con-
sider the command
     NAME SUB1 SUB2...SUBn = REP1 REP2...REPm
This statement is executed in the following manner:

SUB1, SUB2...SUBn are to indicate a contiguous substring of NAME
which is to be replaced by the string obtained by concatenating the
strings represented by REP1, REP2...REPm. If this substring match
is successful, the test accumulator (T=17) is set to "success" (0); if
the match fails, the test accumulator is set to "failure" (-1).
The matching algorithm proceeds as follows:
The substring elements of the form *S* , *S/j* , *S/T* , and
&S& represent fillers whose lengths are respectively "the shortest
possible," "exactly j (j>=1) characters," "exactly t characters--where
T has value t as a SNOBOL integer--," and "the biggest possible."
The matching algorithm then attempts to find the left-most match for
SUB1 in NAME and then proceeds to find immediately after that a
match for SUB2 and so on through SUBn. If for SUBi+1 a match
is not possible, an attempt is made to extend the right-hand end point of
SUBi one character to the right--in the case of &S&, contract the
right end point one character to the left. If this is not possible
then SUBi-1 is considered, etc. If this fails back through SUB1,
the left-hand end point of SUB1 is incremented 1 to the right
and the process is restarted.
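
The backtracking described above can be sketched in Python. This is only an illustrative model, not the Applied Logic implementation: the tuple encoding of the substring elements ("lit", "min", "len", "max") and the function name `match` are hypothetical, and the sketch assumes that a "shortest possible" filler may be empty.

```python
def match(subject, elements):
    """Return (start, end, bindings) for the left-most match, or None.

    elements is a list of tuples (hypothetical encoding):
      ("lit", text)     a literal string
      ("min", name)     *S*    shortest possible filler
      ("len", name, j)  *S/j*  filler of exactly j characters
      ("max", name)     &S&    biggest possible filler
    """

    def candidates(elem, pos):
        # Yield possible right-hand end points in the order they are tried.
        kind = elem[0]
        if kind == "lit":
            if subject.startswith(elem[1], pos):
                yield pos + len(elem[1])
        elif kind == "min":   # extend one character to the right on failure
            yield from range(pos, len(subject) + 1)
        elif kind == "len":
            if pos + elem[2] <= len(subject):
                yield pos + elem[2]
        elif kind == "max":   # contract one character to the left on failure
            yield from range(len(subject), pos - 1, -1)

    def walk(i, pos, bindings):
        if i == len(elements):
            return pos, bindings
        elem = elements[i]
        for end in candidates(elem, pos):
            new = dict(bindings)
            if elem[0] != "lit":
                new[elem[1]] = subject[pos:end]
            result = walk(i + 1, end, new)
            if result is not None:
                return result
        return None            # back up to the previous element

    # If every element fails, move the left-hand end point 1 to the right.
    for start in range(len(subject) + 1):
        result = walk(0, start, {})
        if result is not None:
            end, bindings = result
            return start, end, bindings
    return None
```

A return value of None corresponds to the test accumulator being set to "failure"; any other value corresponds to "success."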

If SUB1 SUB2...SUBn does not appear in the command string,
the whole string NAME is considered to be matched. If REP1 REP2
...REPm is missing the matching substring of NAME is to be de-
leted. If = REP1 REP2...REPm is missing (then n>0), this opera-
tion is used strictly for setting the fail-success accumulator. For ex-
ample
     NAME = ",A,B,C,"

     NAME "," *VAR* "," =

results in NAME being B,C, and VAR being A and the test
accumulator is set to "success."

     NAME = ",A,B,C,"

     NAME *DELIM/1* *VAR* DELIM
sets DELIM to , and sets VAR to A. Again, the test accumu-
lator is set to "success."
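
For readers more familiar with regular expressions, the two examples above can be approximated with Python's `re` module. This is an analogy only, not a definition of SNOBOL semantics; the variable names are chosen for illustration. A non-greedy group stands in for the shortest-possible filler, and a backreference stands in for the reuse of DELIM within the same match.

```python
import re

name = ",A,B,C,"

# NAME "," *VAR* "," =      delete the first ",<shortest filler>,"
m = re.search(r",(.*?),", name, flags=re.DOTALL)
var = m.group(1)
name_after = name[:m.start()] + name[m.end():]

# NAME *DELIM/1* *VAR* DELIM      test only; nothing is replaced
m2 = re.search(r"(.)(.*?)\1", name, flags=re.DOTALL)
delim, var2 = m2.group(1), m2.group(2)

print(repr(name_after), repr(var), repr(delim), repr(var2))
```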

It has been found to be convenient to make replacements within
a proper substring of the substring being matched. SNOBOL delimits
such a substring by including one or both of [ or ] between the SUBi's.
A missing [ is tacitly assumed to appear before SUB1; similarly a
missing ] is assumed to follow SUBn. For example

     NAME = ",A,B,C,"

     NAME *DELIM/1* *VAR* ] DELIM =
results in NAME being set to ,B,C, and VAR being A and
DELIM being , . Note that the comma preceding B is not
replaced (deleted) since replacement is restricted to the bracketed
substrings.
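
The effect of the bracket can be pictured as follows, again using Python's `re` module as a stand-in for the SNOBOL matcher (an assumption made for illustration; `replace_span` is a hypothetical helper). The whole pattern must match, but only the bracketed portion is replaced.

```python
import re

def replace_span(subject, start, end, replacement):
    """Replace subject[start:end] and leave the rest of the string alone."""
    return subject[:start] + replacement + subject[end:]

name = ",A,B,C,"

# NAME *DELIM/1* *VAR* ] DELIM =
# Group 1 is DELIM, group 2 is VAR, and \1 is the trailing DELIM.
m = re.search(r"(.)(.*?)\1", name, flags=re.DOTALL)

# The implicit [ sits before the first element; the ] follows *VAR*.
bracket_start, bracket_end = m.start(1), m.end(2)
result = replace_span(name, bracket_start, bracket_end, "")

print(result)
```

The trailing delimiter takes part in the match but lies outside the bracketed span, so it survives the deletion.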

SUBi can, in addition, be a disjunction of string names or liter-
als. For example, replace the right-most occurrence of period or
comma by a semi-colon:

NAME = ",A,B,C," 

NAME. &VAR&. £09 ay" 
results in NAME being ,A,B,C; in VAR being ,A,B,C and in
the test accumulator being set to "success."

An additional feature of SNOBOL is the anchor mode for matching
the substring. In this mode the matching substring must include the first
character of the string. This mode is indicated by an < immediately
following the first string name. Spaces and tabs are used in SNOBOL as
syntactic delimiters for readability and in a few cases to remove ambig-
uities caused by names running together. The delimiters however are
not needed in general except in cases of such ambiguities.
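
As a rough illustration of anchor mode (a Python sketch under the assumption that only a literal is being searched for; the function name `find_literal` is hypothetical), an anchored search tries the match only at the first character, while a normal search may start anywhere:

```python
def find_literal(subject, literal, anchored=False):
    """Return the start index of the match, or -1 for "fail"."""
    if anchored:
        # Anchor mode: the match must begin at the first character.
        return 0 if subject.startswith(literal) else -1
    # Normal mode: left-most occurrence anywhere in the subject.
    return subject.find(literal)
```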




Arithmetic Relations

Let A and B be string elements. Then

     A = B       (is A equal to B ?)
     A # B       (is A unequal to B ?)
     A <= B      (is A less than or equal to B ?)
     A < B       (is A less than B ?)
     A > B       (is A greater than B ?)
     A >= B      (is A greater than or equal to B ?)

are arithmetic relations. Note that A and B may not themselves be
arithmetic terms. An arithmetic relation sets the test accumulator to
"success" or "failure" according as the relation holds or does not hold.


Transfer Command

Let L1 and L2 be SNOBOL names which are used as labels
or are string names preceded by a commercial-at sign or a dollar
sign. Then

     /(L1)           (transfer to L1)

     /S(L1)          (transfer to L1 if test accumulator is "success")

     /F(L1)          (transfer to L1 if test accumulator is "fail")

     /F(L1)S(L2)     (transfer to L2 if test accumulator is "success";
     /S(L2)F(L1)      transfer to L1 if test accumulator is "failure")

are all transfer commands. If desired, a transfer command may be lo-
cated to the right of a string command, an arithmetic relation, or a
SNOBOL command.


SNOBOL Command:

Let E1, E2, ..., En be string elements. Let S be a string name.
Let L be a label address or a string name preceded by a commercial
at sign or a dollar sign.


The legal SNOBOL commands have the following formats.

     .TYPE    E1 E2 ... En    (n>=1)
     .NTYPE   E1 E2 ... En    (n>=1)
     .ACCEPT  S
     .OPIN    S
     .OPOUT   S
     .APOUT   S
     .WRITE   E1 E2 ... En    (n>=1)
     .NWRITE  E1 E2 ... En    (n>=1)
     .READ    S
     .CLIN
     .CLOUT
     .RIN     S
     .ROUT    S
     .SNIN
     .NSNIN
     .SNOUT
     .NSNOUT
     .PUSHJ   L
     .POPJ
     .POP
     .MACRO
     .SNOBOL
     .EXIT
     .END

Use of SNOBOL commands will be described below.



.TYPE E1 E2 ... En    (n>=1)

The strings referenced by the string elements which appear to the
right of a .TYPE statement are concatenated together and outputted on
the Teletype and a terminating CR-LF is typed. For example

     .TYPE "A" "B
     C"

     .TYPE "D"
outputs
     AB
     C
     D

on the Teletype. This command does not affect the test accumulator.


.NTYPE E1 E2 ... En    (n>=1)

The .NTYPE statement is similar to the .TYPE statement
except that the terminating CR-LF is not affixed. For example

     .NTYPE "A" "B
     C"
     .TYPE "D"

outputs

     AB
     CD

on the Teletype. This command does not affect the test accumulator.


.ACCEPT S

This statement readies the Teletype for input, deletes S, reads
in the first line of input into S. In both the .ACCEPT statement and
the .READ statement described below the input is terminated by a
character which represents a vertical movement. These characters are
the line-feed, form-feed, vertical tab, and the line-feed generated by a
carriage return (recall that the return character generates a CR and a
LF). In the case that the input line is terminated by a CR-LF or a
LF these terminating characters are deleted from the input line. The
vertical tab and the form feed however are stored with the line. It has
been found that these conventions for input are very convenient in
SNOBOL. This command does not affect the test accumulator.
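
The line-termination conventions just described can be modelled with a short Python sketch. It is an illustration only: the function name `read_line` and the idea of reading from an in-memory string are assumptions, not part of the Applied Logic system.

```python
def read_line(data, pos=0):
    """Read one input line from data starting at pos.

    Returns (line, next_pos). A terminating LF or CR-LF is deleted;
    a terminating vertical tab or form-feed is stored with the line.
    """
    line = []
    i = pos
    while i < len(data):
        ch = data[i]
        i += 1
        if ch == "\n":
            # Drop the LF, and the CR that generated it if present.
            if line and line[-1] == "\r":
                line.pop()
            break
        line.append(ch)
        if ch in "\f\v":
            # Form-feed and vertical tab end the line but are kept.
            break
    return "".join(line), i
```

Calling `read_line` repeatedly with the returned position walks through the input one line at a time.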


.OPIN S

This statement opens an input file on the Disc (Drum) whose name
is contained in S. For example

     .OPIN "ABC.EXT"

looks on the user's directory for the file ABC.EXT. The operating
system, SNOOPS, complains if the file is not found or not accessible.
The test accumulator is set to "success" if the file is found and is
accessible. The test accumulator is set to "fail" in the contrary case.
At most, one input file can be opened at a time from the Disc (Drum).


.OPOUT S

This statement opens an output file with the name contained in
S. At most, one output file on the Disc (Drum) can be opened at any
time. This command sets the test accumulator to "success" if the file
was successfully opened and to "failure" in the contrary case.


.APOUT S

This statement is analogous to the .OPOUT statement except that
the file named in S should be an existing file and output is to be ap-
pended on to the end of that file. SNOOPS complains if the file named
in S does not exist.



.WRITE E1 E2 ... En    (n>=1)

The .WRITE statement is analogous to the .TYPE statement
except that output is to the Disc (Drum) file opened by the last .OPOUT
or .APOUT statement. In both the .TYPE and .WRITE statements
a CR-LF is affixed to the end of the line if the characters typed do not
terminate in a vertical tab or a form-feed. The test accumulator is not
affected.

.NWRITE E1 E2 ... En    (n>=1)

This statement is identical to the .WRITE statement except that
in no case is a CR-LF affixed.
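
The output rules for .TYPE, .NTYPE, .WRITE and .NWRITE can be summarized in a small Python model. This is a sketch under stated assumptions: the function name `emit` and its `newline` flag are hypothetical, output goes to any file-like object, and the line break inside a literal is taken to be a CR-LF.

```python
import io

VERTICAL = ("\v", "\f")   # vertical tab, form-feed


def emit(out, *elements, newline=True):
    """Concatenate the string elements and write them to out.

    newline=True models .TYPE and .WRITE: a CR-LF is affixed unless
    the text already ends in a vertical tab or a form-feed.
    newline=False models .NTYPE and .NWRITE: nothing is affixed.
    """
    text = "".join(elements)
    out.write(text)
    if newline and not text.endswith(VERTICAL):
        out.write("\r\n")


def demo():
    """Reproduce the .TYPE example: .TYPE "A" "B<CR-LF>C" then .TYPE "D"."""
    out = io.StringIO()
    emit(out, "A", "B\r\nC")
    emit(out, "D")
    return out.getvalue()
```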


.READ S

This statement is analogous to the .ACCEPT statement except
that input is the next line of the Disc (Drum) file opened by the previous
.OPIN statement. If no Disc (Drum) input file is open SNOOPS com-
plains. The test accumulator is set to "success" if a line was success-
fully read. In the contrary case, the test accumulator is set to "fail."


.CLIN

This statement closes the input file.

.CLOUT

This statement closes the output file. It is not necessary to execute
the .CLIN or .CLOUT statements unless one is going to open a new
input, respectively output, file on the Disc (Drum).


.RIN S

This statement closes the input file currently open and renames
it to the name in S. If S contains the null string the input file is
deleted rather than renamed.


.ROUT S

This statement is analogous to the .RIN statement except that it
renames or deletes the output file opened on the Disc (Drum).


.SNIN

The normal mode of reading input from the Disc (Drum) strips
sequence numbers from the lines if they appear. This statement acts
as a switch which initiates the feature that sequence numbers are read
in as part of the line. The format of the sequence number is as fol-
lows. The first five characters are numerals; the sixth character is
a tab unless the content of the line is to be empty, in which case the
sixth character is a line-feed. Hence the .SNIN feature will cause
.READ to read in six characters plus the remaining characters of a
line (or read in exactly five numerals, since the terminating line-feed
is stripped, in the case that the line represented an empty line with a
sequence number). The remaining commands do not affect the test ac-
cumulator.
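
The sequence-number format can be made concrete with a Python sketch of what the normal (non-.SNIN) input mode does to a line. The function name is hypothetical, and the sketch assumes the terminating CR-LF or line-feed has already been removed from the line.

```python
DIGITS = "0123456789"


def strip_sequence_number(line):
    """Remove a leading sequence number from one input line, if present.

    A sequence number is five numerals followed by a tab. An empty
    sequenced line is exactly five numerals, because its sixth
    character was the line-feed that has already been stripped.
    """
    head = line[:5]
    if len(head) == 5 and all(ch in DIGITS for ch in head):
        if len(line) == 5:
            return ""              # empty line with a sequence number
        if line[5] == "\t":
            return line[6:]        # drop the numerals and the tab
    return line                    # no sequence number present
```

With .SNIN in effect the line would be delivered unchanged, so `"00010\tHELLO"` is read as all eleven characters.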

.NSNIN

This statement turns off the feature initiated by the .SNIN.


.SNOUT

The normal mode for output on Disc (Drum) does not generate
sequence numbers. This statement initiates the feature that considers
the first six characters of each output line to be a sequence number.
The format for sequence numbers must be adhered to if the output file
is to be used with other programs on the Applied Logic system. Since
the .WRITE statement affixes a CR-LF, it is always safe in this
case to use a tab as the sixth character. The user who uses the
.NWRITE statement should take care not to get a seven-character line
consisting of five numerals, a tab, and a line-feed.


.NSNOUT

This statement turns off the feature initiated by the .SNOUT
statement.


.PUSHJ L

This statement causes a transfer to L or in the case of in-
direct addressing to the label named in L. The address of the
statement immediately following a .PUSHJ statement is saved on a
push-down list.


.POPJ

This statement pops the address pushed on by the last .PUSHJ
statement and transfers to the address.


.POP

This statement is used to pop off the last address pushed on the
push-down list by the last .PUSHJ statement. However, no transfer
is made.
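
The three subroutine statements can be pictured with a small Python model of the push-down list. This is a sketch only: the class name `Machine`, the use of statement indexes as addresses, and the assumption that execution continues with the next statement after .POP are all illustrative choices.

```python
class Machine:
    """Toy model of .PUSHJ, .POPJ and .POP acting on a push-down list."""

    def __init__(self):
        self.stack = []   # push-down list of return addresses
        self.pc = 0       # index of the statement being executed

    def pushj(self, target):
        # Save the address of the statement following the .PUSHJ,
        # then transfer to the target label.
        self.stack.append(self.pc + 1)
        self.pc = target

    def popj(self):
        # Pop the last saved address and transfer to it.
        self.pc = self.stack.pop()

    def pop(self):
        # Discard the last saved address; no transfer is made.
        self.stack.pop()
        self.pc += 1
```

.POP is useful when a subroutine decides not to return to its caller, since it keeps the push-down list from growing without bound.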


.MACRO

This statement is a signal to the SNOBOL compiler to consider
subsequent lines of coding to be MACRO assembly code.


.SNOBOL

This statement is a countermand to the SNOBOL compiler revok-
ing a previous .MACRO statement. Lines subsequent to this statement
are considered by the SNOBOL compiler to be SNOBOL source. If the
first character in a line is an up-arrow (↑), the line is considered to
be MACRO code if .SNOBOL is currently in effect, or to be SNOBOL
code if .MACRO is in effect.
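
The mode switching can be sketched in Python as a pass over the source lines that labels each one as SNOBOL or MACRO. This is a model of the rule stated above, not the compiler itself: the function name `classify` is hypothetical, the ASCII character ^ stands in for the up-arrow, and the sketch assumes the source starts in SNOBOL mode.

```python
def classify(lines):
    """Label each source line as ("SNOBOL", text) or ("MACRO", text)."""
    mode = "SNOBOL"               # assumed starting mode
    result = []
    for line in lines:
        word = line.strip()
        if word == ".MACRO":
            mode = "MACRO"        # following lines are assembly code
            continue
        if word == ".SNOBOL":
            mode = "SNOBOL"       # revoke a previous .MACRO
            continue
        if line.startswith("^"):
            # An up-arrow flips the mode for this one line only.
            other = "MACRO" if mode == "SNOBOL" else "SNOBOL"
            result.append((other, line[1:]))
        else:
            result.append((mode, line))
    return result
```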


.EXIT

This generates an exit call to the system, closes all open files,
and results in the user being put in monitor command mode and the
Teletype responding with
     EXIT
     ↑C

.END

This statement signals the SNOBOL compiler that there is
no more code to follow. If this statement is omitted, the
compiler assumes an .END statement. The .END statement generates
a .EXIT statement.



SUMMARY OF SNOBOL FEATURES 


LITERAL STRINGS 


null string 


quote 


"aaa---a"


STRING ELEMENTS

     "123"           literal strings
     STR2            string names


SUBSTRING ELEMENTS

     STR2            string names
     "123"           literal strings
     "123" .STR2     disjunctions of string elements
     *STR2*          make STR2 be minimum length
     *STR2/n*        make STR2 be length n
     *STR2/TEE*      make STR2 have length = integer value of TEE
     &STR2&          make STR2 be maximum length


STRING COMMANDS

     LBL: NAME STRNG = REP /S(LBL1)F(LBL2); REMARKS
                                       full statement-replacement.
     NAME STRNG =                      pattern search and deletion.
     NAME< STRNG                       pattern search in Anchor mode
     NAME SUB1 ] SUB2 = REP1
     NAME SUB1 [ SUB2 ] SUB3 = REP2    partial substring replacements
     NAME SUB1 [ SUB2 SUB3 = REP2 REP3



NORMALIZED SNOBOL INTEGERS 


"ABC" 
*7 CG" 
ere i Pa 


"993477" 


"A73" 
"383" 


Zero 


Zero




INTEGER OPERATIONS

     +  -  *  /

     NBR + "4"
     NBR * SUM
     - "3"


TRANSFER COMMANDS

     /(LABEL)            unconditional
     /S(LBL)             if search or relation is successful
     /F(LBL)             if search or relation fails.
     /S(LBL1)F(LBL2)     2-way branch
     /F(LBL2)S(LBL1)     2-way branch
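
How the test accumulator selects among these transfer forms can be sketched in Python. The function name `next_label` and its keyword arguments are hypothetical; a result of None means that no transfer is made and execution falls through to the next statement.

```python
def next_label(success, s_label=None, f_label=None, goto=None):
    """Choose the label to transfer to after a statement.

    success  True if the test accumulator is "success"
    goto     label of an unconditional transfer   /(LABEL)
    s_label  label taken on success               /S(LBL)
    f_label  label taken on failure               /F(LBL)
    """
    if goto is not None:
        return goto
    return s_label if success else f_label
```

Supplying both `s_label` and `f_label` gives the 2-way branch; the order in which S and F are written does not matter.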


ARITHMETIC RELATIONS 


STR1 


STR] 


tt: 


STR2 
"73" 


3°36. = 


SPECIAL

     ;     precedes a remark
     ↑     precedes a line of Macro code in SNOBOL mode


SPECIAL COMMANDS



.TYPE E1 E2 ... En     Type a string of characters plus <RETURN>
.NTYPE E1 E2 ... En    Type a string of characters.
.ACCEPT S              Read from TTY into S until <RETURN>
.OPIN S                Open the file named in S as the input file
.OPOUT S               Create a file named in S and open it as the
                       output file.
.APOUT S               Open the end of the file named in S as the output
                       file.
.WRITE E1 E2 ... En    Write a string of characters plus <RETURN>
                       in the output file.
.READ S                Read from the input file one record into S.
.CLIN                  Close the input file.
.CLOUT                 Close the output file.
.RIN S                 Close the input file and rename it as S.
.ROUT S                Close the output file and rename it as S.
.SNIN                  Set input mode to accept sequence numbers.
.NSNIN                 Cancel the sequence number input mode.
.SNOUT                 Set output mode to write sequence numbers.
.NSNOUT                Cancel the sequence number output mode.
.PUSHJ SUBR            Call subroutine named SUBR
.POPJ                  Return from subroutine.
.POP                   Delete return address from last subroutine call.
.MACRO                 Set mode to accept MACRO code.
.SNOBOL                Return mode from MACRO to SNOBOL mode.
.EXIT                  Exit from running program into ALC monitor
                       command mode.
.END                   Denotes end of SNOBOL coding for a program.




